Mooncake Inside Problems #1152
Okay. I believe this is now ready for review.
Note: formatting failures appear unrelated to the changes in this PR.
@@ -211,6 +211,9 @@ function adjointdiffcache(g::G, sensealg, discrete, sol, dgdu::DG1, dgdp::DG2, f
paramjac_config = get_paramjac_config(autojacvec, p, f, y, _p, _t; numindvar, alg)
pf = get_pf(autojacvec; _f = unwrappedf, isinplace = isinplace, isRODE = isRODE)
paramjac_config = (paramjac_config..., Enzyme.make_zero(pf))
elseif autojacvec isa MooncakeVJP
@ChrisRackauckas, it is probably better to refactor these hard-coded branches (e.g., define an interface function that other packages can overload). It would help
- autograd tools to integrate with SciMLSensitivity easily
- move some existing autograd glue code into package extensions to avoid hard deps
It might also help to switch to DI where possible to avoid duplicate glue code in the ecosystem. @gdalle
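The refactor suggested above could look roughly like the following sketch, in which each backend adds a method to one generic function instead of extending an `if`/`elseif` chain. Every name here (`AbstractVJPChoice`, `EnzymeLikeVJP`, `MooncakeLikeVJP`, `build_paramjac_config`, `make_cache`) is hypothetical and chosen for illustration; none of it is SciMLSensitivity API, and the returned fields are placeholders.

```julia
# Hypothetical names throughout; this is not SciMLSensitivity's actual interface.
abstract type AbstractVJPChoice end
struct EnzymeLikeVJP <: AbstractVJPChoice end
struct MooncakeLikeVJP <: AbstractVJPChoice end

# Fallback: a backend that has not opted in fails with a clear message.
function build_paramjac_config(vjp::AbstractVJPChoice, f, y, p, t)
    throw(ArgumentError("no paramjac config defined for $(typeof(vjp))"))
end

# Each backend (possibly from its own package or extension) adds one method.
build_paramjac_config(::EnzymeLikeVJP, f, y, p, t) =
    (; backend = :enzyme, dy = zero(y))
build_paramjac_config(::MooncakeLikeVJP, f, y, p, t) =
    (; backend = :mooncake, dy = zero(y), p_grad = zero(p))

# The caller dispatches on the backend type and needs no branch per backend.
make_cache(vjp, f, y, p, t) = build_paramjac_config(vjp, f, y, p, t)
```

With this shape, supporting a new backend means adding one method, and that method can live outside the core package.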
> autograd tools to integrate with SciMLSensitivity easily

Is that really a high priority right now? How many more autograd packages are you going to write this year that will be useful?

> move some existing autograd glue code into package extensions to avoid hard deps

That doesn't necessarily make sense. Most of the methods are used in the default method so they would be required to be loaded by default anyways?

> It might also help to switch to DI where possible to avoid duplicate glue code in the ecosystem. @gdalle

That's the plan when it's able to handle this case well. Currently it's not able to.
> Is that really a high priority right now? How many more autograd packages are you going to write this year that will be useful?
Mooncake is getting a new forward mode (an attempt to improve ForwardDiff with GPU compatibility and fewer constraints; see here for more details), so @willtebbutt will likely need to modify these again in the near term.
Wouldn't that just require modifying https://github.com/SciML/SciMLSensitivity.jl/pull/1152/files#diff-1a15b4b5711133c125548ef7f1ca88f761bb124cffc8bfde8c13336968aaccd6R466 ? I don't see why that would touch this function rather than just dispatching there.
I mean, if someone wants to do a refactor here that's perfectly fine. But I also don't see why it would be a high priority since it's not like new AD systems get added every year, and modifications to existing ones don't really touch this part of the code much. I would think the time would be better spent just trying to get DI up to speed than refactoring this old code.
We can discuss DI integration in #1040 if you want
src/adjoint_common.jl
Outdated
_p = Mooncake.CoDual(p, Mooncake.set_to_zero!!(p_grad))
_t = Mooncake.zero_codual(t)
λ_mem .= λ
dy, _ = Mooncake.__value_and_pullback!!(rule, λ_mem, _pf, _dy_mem, _y, _p, _t)
Is the allocation of the `dy` necessary?
Probably not -- I'm working on a Mooncake PR at the minute that should make this redundant. If I manage to get that out of the door before we're ready to merge this I'll modify this PR. Otherwise I'll make a small non-breaking follow-up next week.
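To make the allocation question concrete, here is a minimal sketch of a vector-Jacobian product that writes the primal output and both gradients into preallocated buffers. It uses a hand-written pullback for the toy right-hand side `dy .= p .* y` as a stand-in for what an AD backend would generate. The names `rhs!`, `VJPBuffers`, and `vjp!` are hypothetical; this is not Mooncake's API and not the code in this PR.

```julia
# Illustrative only: a hand-written pullback, not generated by any AD package.
function rhs!(dy, y, p, t)
    dy .= p .* y
    return nothing
end

# Scratch space created once and reused across calls.
struct VJPBuffers{T}
    dy::Vector{T}
    y_grad::Vector{T}
    p_grad::Vector{T}
end
VJPBuffers(y::Vector{T}) where {T} =
    VJPBuffers{T}(similar(y), similar(y), similar(y))

# Writes f(y, p, t), λ' * ∂f/∂y, and λ' * ∂f/∂p into `buf` without creating new arrays.
function vjp!(buf::VJPBuffers, λ, y, p, t)
    rhs!(buf.dy, y, p, t)   # primal output, written in place
    buf.y_grad .= p .* λ    # ∂(p .* y)/∂y is diag(p)
    buf.p_grad .= y .* λ    # ∂(p .* y)/∂p is diag(y)
    return buf
end
```

Because the caller owns `buf`, nothing needs to be returned for the results to be visible; returning `buf` is only a convenience.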
This looks good! Though my main question is why the vector is returned since it's just mutated and so normally we just
The docstring just needs to be added to the list in the manual part of the docs, with a quick description (which I believe is already in there)
Thanks for the advice @ChrisRackauckas . I'm still slightly concerned that the tests don't appear to ever hit the out-of-place ODE implementations (the
The next set right after is for that.
Ohhhhhhh I see. 🤦 Thanks for the pointer. I'll add Mooncake to those tests tomorrow.
Tests added -- @ChrisRackauckas I think CI should pass if you trigger a run.
Does this require #1151 to be merged first?
No, those two are completely independent.
This looks good to go, though we should discuss the extension / default alg part before committing to taking it on as a hard dep, since currently we're looking to go in the other direction as much as possible!
Looks good. Awesome!
Great. I've bumped the minor version, and ensured that the formatter is happy with all of the files I've touched. I'm happy for you to merge whenever is convenient for you.
Checklist
contributor guidelines, in particular the SciML Style Guide and
COLPRAC.
Additional context
This addresses #1105
It's not quite ready for review yet, but is almost there. In particular I need to
I've verified locally that the performance on the cases in `test/adjoint.jl` is competitive with Enzyme.jl, which is nice.